2023-10-18
Broadly, web scraping is the process of automatically pulling data from websites by reading their underlying code. Doing this gets complicated fast:
It is difficult to predict how much time a web scraping task will take.
Sites might change, so the scraping code needs to be updated.
Site maintainers may not be okay with data being scraped. Quick plug for the TECH team’s Automated Data Guidelines.
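As a rough illustration of "reading the underlying code," here is a minimal Python sketch that pulls the rows out of an HTML table using only the standard library. The page source, state names, and numbers are made up for the example, and `TableParser` and `scrape_table` are hypothetical names, not the code used in the projects described here. The header check at the end is one simple way to notice when a site has changed.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows
        self._row = None    # row currently being read, if any
        self._cell = None   # text pieces of the cell being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def scrape_table(html):
    """Return the table cells in `html` as a list of rows."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical page source with made-up values, standing in for a real site.
SAMPLE_HTML = """
<table>
  <tr><th>State</th><th>Enrollment</th></tr>
  <tr><td>State A</td><td>1,000</td></tr>
  <tr><td>State B</td><td>2,500</td></tr>
</table>
"""

rows = scrape_table(SAMPLE_HTML)

# Fail loudly if the page layout no longer matches what the code expects.
EXPECTED_HEADER = ["State", "Enrollment"]
if rows[0] != EXPECTED_HEADER:
    raise ValueError(f"Unexpected table header: {rows[0]}")
```

In practice the HTML would come from a request to the site rather than a string, and pages that only reveal data after clicking usually need a browser-automation tool instead.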
Since spring 2023, states have been disenrolling Medicaid beneficiaries who no longer qualify now that the Public Health Emergency has ended.
In anticipation of “the great unwinding,” many states implemented policy changes to smooth the transition.
To understand the success of these policies, we wanted time-series enrollment data for all 50 states… from a Medicaid data system that is largely decentralized.
Why page through PDFs when another organization’s RAs can do it for you?
One URL with data you can only get by clicking each option!
Whenever new data were released over the following two months, I re-ran the code and got a well-formatted Excel file as output.
Two months later, KFF stopped updating the dashboard and changed how existing data were reported on its graphs.
States are increasingly interested in work-based learning (WBL) as an important strategy for helping students prepare for and access good jobs, but measurement has been limited.
To understand the prevalence and types of WBL, we wanted course-level data from community colleges across the country.